Differences of gut microbiota and behavioral symptoms between two subgroups of autistic children based on γδT cells-derived IFN-γ Levels: A preliminary study

Background Autism Spectrum Disorders (ASD) are defined as a group of pervasive neurodevelopmental disorders, and the heterogeneity in the symptomology and etiology of ASD has long been recognized. Altered immune function and gut microbiota have been found in ASD populations. Immune dysfunction has been hypothesized to involve in the pathophysiology of a subtype of ASD. Methods A cohort of 105 ASD children were recruited and grouped based on IFN-γ levels derived from ex vivo stimulated γδT cells. Fecal samples were collected and analyzed with a metagenomic approach. Comparison of autistic symptoms and gut microbiota composition was made between subgroups. Enriched KEGG orthologues markers and pathogen-host interactions based on metagenome were also analyzed to reveal the differences in functional features. Results The autistic behavioral symptoms were more severe for children in the IFN-γ-high group, especially in the body and object use, social and self-help, and expressive language performance domains. LEfSe analysis of gut microbiota revealed an overrepresentation of Selenomonadales, Negatiyicutes, Veillonellaceae and Verrucomicrobiaceae and underrepresentation of Bacteroides xylanisolvens and Bifidobacterium longum in children with higher IFN-γ level. Decreased metabolism function of carbohydrate, amino acid and lipid in gut microbiota were found in the IFN-γ-high group. Additional functional profiles analyses revealed significant differences in the abundances of genes encoding carbohydrate-active enzymes between the two groups. And enriched phenotypes related to infection and gastroenteritis and underrepresentation of one gut–brain module associated with histamine degradation were also found in the IFN-γ-High group. Results of multivariate analyses revealed relatively good separation between the two groups. Conclusions Levels of IFN-γ derived from γδT cell could serve as one of the potential candidate biomarkers to subtype ASD individuals to reduce the heterogeneity associated with ASD and produce subgroups which are more likely to share a more similar phenotype and etiology. A better understanding of the associations among immune function, gut microbiota composition and metabolism abnormalities in ASD would facilitate the development of individualized biomedical treatment for this complex neurodevelopmental disorder.


Introduction
Autism spectrum disorders (ASDs) are defined as a group of pervasive neurodevelopmental disorders characterized by persistent deficits in social communication and social interaction, plus restricted, repetitive patterns of behavior, interests, or activities (1). The exact etiology of ASD is still unknown and accumulated results of recent studies suggest that instead of a single causative factor, ASD may be caused by the combined effects and interplay between genetic heritability and complex environmental risk factors (2)(3)(4)(5).
It has been demonstrated that certain immune-mediated conditions (such as allergies and some autoimmune diseases) were more prevalent in ASD subjects (21)(22)(23)(24). Immune dysfunction has been hypothesized to involve in the pathophysiology of a subtype of ASD (12,20,25). Compared to behavioral symptoms, immune abnormalities are more objective since they can be measured using clinical and laboratory characteristic, thus would be a potential ideal subtyping indicator (25). Gamma delta (gd) T cells play important roles in inflammatory and autoimmune diseases (26). They add to the imbalanced pro-and anti-inflammatory reactions and recruit other immune cells such as macrophages (27). IFN-g is one of the major cytokines produced by gd T cells. Results from several previous studies revealed elevated IFN-g levels in different tissues in some but not all ASD subjects (28-31), indicating that IFN-g might participate in the immune dysfunction associated with certain subtype of ASD.
The crosstalk between gut microbiota and host immune function has been increasingly recognized in recent years (32)(33)(34)(35). Gut microbiota can interact with immune cells and modulate the function of immune system, and inflammation, which is caused by abnormal immune responses, can also influence the composition of the gut microbiome (32,34). Moreover, differences in the intestinal microbial community have also been found in children with ASD as compared to neurotypicals (17,(36)(37)(38)(39). Since gut microbes can communicate with the host brain through multiple ways, including neuroactive compounds, toxin metabolites and immune modulation, it is suggested that alterations of gut-immune-brain axis play critical roles in ASD (38,(40)(41)(42)(43). A better understanding of the association and comprehensive interactions among gut microbiota composition, immune characterization and behavioral symptoms in ASD, and subtyping this heterogenous disorder based on objective immunological characteristics into more homogeneous subgroups will not only provide useful information on the biological mechanisms underlying the pathogenesis of ASD, but also facilitate the development of individualized therapy strategies for ASD population (9,18,20).
In the present study, we recruited a cohort of 105 ASD children, and levels of IFN-g derived from gd T cells were measured to cluster children into subgroups. Comparison of autistic symptoms and gut microbiota composition was made between subgroups. Enriched KEGG orthologues markers and pathogen-host interactions based on metagenome were also analyzed to reveal the differences in functional features. Results of this study will provide additional evidence to support the association of gut microbiota alterations and immune dysfunction in ASD and suggest that IFN-g could serve as a potential candidate biomarker to subtype ASD individuals into subgroups which tend to share a more similar phenotype and etiology.

Participants
The study was approved by the Institutional Review Board of Peking Union Medical College Hospital (IRB #ZS-824). Autistic children were consecutively recruited from the Herun Clinic in Beijing, China. The inclusion criteria for autistic children were (1): Being diagnosed with autism which was confirmed by experienced psychiatrists according to the Diagnostic and Statistical Manual of Mental Disorders Fifth Edition (DSM-V, 2013) criteria (2). Free of antibiotic treatment, prebiotics and probiotics for at least 4 weeks before sample collection (3). The children's primary caregivers had good reading and comprehension skills and were able to fill in the relevant assessment scales (4). The children's parents or legal guardians volunteered to participate in this study and signed the informed consent. Autistic children with symptoms of other comorbid neurological or psychiatric disorders as confirmed by experienced clinicians or psychiatrists were excluded from the study. Detailed information on the purposes and procedures of the study were explained to the children's parents or legal guardians. Written forms of full informed consent were obtained before involving the children in the study.

Assessment of autistic symptoms
The following scales were used to assess autistic symptoms in children:

Detection of IFN-g expression in gd T cells
Two milliliters of fasting venous blood samples were collected into chilled heparin tubes by trained nurses between 8:00 and 10:30 a.m. The samples were then diluted with equal volume of PBS, and the peripheral blood mononuclear cells (PBMCs) were separated by Ficoll (Tianjin Hao Yang Biological Technology Company) centrifugation.
The stimulated PBMCs were washed twice with PBS, centrifuged and then resuspended. Subsequently, CD3-PE conjugated antibody (BD Pharmingen), TCRgd-FITC conjugated antibody (Biolegend) was added to the cells. After 30 minutes of incubation at 4°C avoiding light, the cells were washed twice with PBS. The cells were then permeabilized for staining of intercellular cytokine with Cytoperm/ Cytofix Fixation/Permeabilization Kit (BD). Subsequently, cells were incubated with APC-conjugated IFN-g antibody (BD Pharmingen) for an hour. Then the cells were washed and resuspended with PBS, followed by flow cytometry assessment. Flow cytometry was performed on BD Accuri C6 flow cytometer, and the following data analysis was conducted with CFlow Plus 1.0.164.15.

Fecal sample collection and DNA extraction
Children's fresh fecal samples were obtained at home or Herun Clinic, immediately transferred into 1.5 ml sterile Eppendorf tubes (Axygen), and frozen into dry ice. All samples were stored at -80°C until analysis. DNA was extracted from fecal samples using the MO-BIO PowerSoil DNA Isolation Mini-Kit (Carlsbad) according to the protocol. DNA quality was assessed and controlled using gel electrophoresis.

Metagenomic library construction and sequencing
The sequencing library construction and template preparation was performed using the NEBNext UltraTM DNA Library Prep Kit (New England Biolabs) following manufacturer's instructions (input DNA >100 ng). Each sample was barcoded and equal quantities of barcoded libraries were used for sequencing. The quality and quantity of the libraries were assessed using the Agilent 2100 High Sensitivity DNA Kit (Agilent Technologies) and the ABI 7500 Real Time PCR System (Applied Biosystems) before Illumina sequencing. Illumina HiSeq 2500 and Hiseq X Ten sequencing systems were used for paired-end 150bp sequencing. Data with adaptor contamination and low-quality reads were discarded from the raw data. We acquired~223Gb high-quality data for the 38 samples with an average of~5.9Gb per sample.

Data analysis
Taxonomic assignment of the main bacteria and the relative species abundances were calculated using MetaPhlAn (version 1.7.7) (47). Biodiversity of the samples was processed with Vegan (version 2.4-6) in R package. The top 100 most abundance clades in each sample were selected to calculate the "Bray-Curtis" distance and the similarity between samples ( Figure S3). The linear discriminant effect size (LEfSe) analysis was performed to find features (taxa) differentially represented between groups (48).
The Short Oligonucleotide Analysis Package(SOAP2, version 2.21) (49) was used to do the alignment and retain the unique mapped reads to do the downstream analysis. The relative abundance of these (super)contigs or genomes was calculated based on the number of aligned reads normalized by the (super) contig's or genome's size. The integrated non-redundant gene catalog database about the human gut microbiome was used to do the function analysis (50) with the Kyoto Encyclopedia of Genes and Genomes (KEGG) database, the Carbohydrate-Active Enzymes database (CAZy), the Pathogen-Host Interactions database (PHIbase) and the Gut-Brain Modules (GBMs) as described in previously published articles (51)(52)(53).
Other statistical analyses were performed using Statistical Package for the Social Science version 19.0 (SPSS Inc., Chicago, Illinois) and GraphPad Prism version 5.0 (GraphPad Software Inc., San Diego, CA). Continuous data were checked for normal distribution using the Shapiro-Wilk test first. Unpaired t test or non-parametric test (for those data that were not normally distributed) was used for comparison between groups. The Spearman or Pearson correlation test was applied to explore the correlation among autistic symptoms, gut microbiota, and functional categorization. The principal component analysis (PCA), orthogonal partial least-squares discriminate analysis (OPLS-DA) and the multivariate receiver operator curve (ROC) analysis were carried out using the methods as described in the protocol (54). For all tests, a value of p<0.05 (two-tailed) was considered statistically significant. False discovery rates (FDR) were controlled at 0.05 for multiple testing using the Benjamini and Hochberg method.

Characteristics of the enrolled participants
A total of 105 individuals met the inclusion criteria were recruited, and the top and bottom quarter of the participants were selected for questionnaires and fecal microbiota analyses based on their IFN-g level derived from gdT cells. Since some of the participants were outpatients who can only spare a little time with our team, and some of the children may not defecate within these few days, not all of them have time to complete the behavioral symptoms assessment or have their fecal sample successfully collected. Those who completed all the questionnaires and fecal sample collection for metagenomic array constitute the final study samples in the present study.
Demographics of the participants were summarized in Table 1. The two groups were well matched for chronological age and sex composition. The incidences of gastrointestinal symptoms such as diarrhea and constipation showed no statistical difference between groups. As expected, levels of IFN-g derived from gdT cell were much higher in autistic children in the IFN-g-High group as compared to the IFN-g-Low group.

Differences in the severity of autistic behavioral symptoms between groups
Preliminary analysis of IFN-g levels vs ASD severity indicated in recent clinical records (graded as mild, moderate or severe) suggests a rather skewed distribution ( Figure S1).
Significant differences in ABC metrics for severity of autistic behavioral symptoms were found between the IFN-g-High and IFN-g-Low groups. Children in the IFN-g-High group had significantly higher ABC total scores (Median: 72.0, interquartile range (IQR): 59.5~84.0) than those in the IFN-g-Low group (48.0, IQR 45.0~67.0, p<0.01) ( Figure 1A).
The post hoc analyses were conducted on the subscales scores (Table 2), and scores of two subscales in ABC (Body and object use, Social and self-help) demonstrated statistical differences between the two groups ( Figures 1B, C), indicating that the related symptoms of those children in the IFN-g-High group were much severe than those in the IFN-g-Low groups.
There was no statistical difference in ATEC total scores between groups (88.5, IQR 70.5~104.8 in IFN-g-High group vs. 82.0, IQR  (Figure 2A), indicating the language function was much more impaired in the IFN-g-High group.
In order to further evaluate the children's expressive and receptive language performance respectively, the Clinical Language Status Questionnaire (CLSQ) was applied. As is shown in Figure 2B, children in the IFN-g-High group got lower expressive language performance scores (5, IQR1~6) than those in the IFN-g-Low group (7, IQR 5~8, p<0.05). However, scores of receptive language performance showed no difference between the two groups ( Figure 2C).

Differences in the fecal microbiota composition between groups
There was no significant difference in the alpha-diversity of the fecal microbiota between the two groups ( Figure S2). The Bray-Curtis dissimilarity revealed no significant difference between the two groups ( Figure S3, PERMANOVA, r 2 = 0.0276, p= 0.398). The LEfSe method was used to determine the taxa at different taxonomic levels which were enriched in the IFN-g-High and IFN-g-Low groups ( Figure 3). Results of the LEfSe analysis revealed underrepresentation of Bacteroides xylanisolvens and Bifidobacterium longum in the IFN-g-High group (p<0.01, Wilcoxon rank-sum test; LDA>3.0). Overrepresentation of several phylotypes were also found in the IFN-g-High group, and Selenomonadales, Negatiyicutes and Veillonellaceae were the top 3 enriched phylotypes with LDA<-4.0.

Differences in functional profiles from the metagenomic data between groups
From the metagenomic data, the KEGG orthologues markers that were different between the IFN-g-Low and IFN-g-High groups were analyzed. The relative abundance of the KEGG orthologues markers related to amino acid metabolism, carbohydrate metabolism and lipid metabolism were found to be decreased in Differences of ABC total and subscales scores between the IFN-g-Low and IFN-g-High groups.   the IFN-g-High group as compared to the IFN-g-Low group with values of p<0.05 ( Figure 4). And the reported differences remained significant after applying the FDR. In order to further explore the possible mechanisms underlying the differences relating carbohydrate metabolism between groups, the abundances of genes encoding carbohydrate-active enzymes (CAZymes) in the fecal microbiome were quantified. CAZymes were annotated by their family as defined in the CAZy database. Significant differences after applying the FDR in the abundances of genes encoding six CAZymes families were found between the two groups. For children in the IFN-g-High group, the relative abundances of genes encoding GlycosylTransferase Family 56 (GT56), Polysaccharide Lyase Family 13 (PL13) and Polysaccharide Lyase Family 8 (PL8) was lower, while the relative abundances of genes encoding Carbohydrate Esterase Family 10 (CE10), Glycoside Hydrolase Family 95 (GH95) and GlycosylTransferase Family 28 (GT28) was higher as compared to those in the IFN-g-Low group ( Figure 5).
The PHI-base phenotypes related to infection (Pathogen gene: purT, Host species: Homo sapiens) and gastroenteritis (Pathogen gene: flhF, Host species: Homo sapiens) were found to be significantly enriched in the IFN-g-High group (Figures 6A, B). Additionally, underrepresentation of one gut-brain module (MGB010) associated with histamine degradation was also found in the IFN-g-High group ( Figure 6C).

The correlation analysis
The Spearman correlation analysis was applied to explore the relationship among IFN-g, autistic behavioral symptoms, gut microbiota and functional modules which were significant in univariate analysis. As is shown in the matrix in Figure 7, the relative abundance of Bacteroides xylanisolvens, which was enriched in the IFN-g-Low group, was negatively correlated with IFN-g level (rho=-0.434, p<0.05), ABC total and subscales (Body and object use, Social and self-help) scores (all p<0.05), and the relative abundance of GlycosylTransferase Family 28 (GT28) (rho=-0.535, p<0.05), and positively correlated with CLSQ expressive language performance scores (rho=0.353, p<0.01). Additionally, the relative abundance of GlycosylTransferase Family 28 (GT28) was negatively correlated with ABC language scores (rho=-0.326, p<0.05), while the relative abundance of Polysaccharide Lyase Family 8 (PL8) was positively correlated with ABC language score (rho=0.303, p<0.05). Among these above correlations, only the correlation between the relative abundance of Bacteroides xylanisolvens and CLSQ expressive language performance scores remained significant after applying the FDR.

Multivariate analysis and potential discriminating features analysis
Since univariate approaches ignore the correlations among variables as demonstrated in Figure 7, multivariate analyses were applied because these methods simultaneously take all variables into consideration. The principal component analysis (PCA) scores plot revealed that samples in the IFN-g-Low group were more concentrated as compared to the scattered pattern of the IFN-g-High group ( Figure 8A). And as is shown in Figure 8B, the scores plot constructed using orthogonal partial leastsquares discriminate analysis (OPLS-DA) revealed relatively good separation between the IFN-g-Low and IFN-g-High groups (Q2 = 0.469, p<0.05; R2Y=0.726, p<0.05).
The algorithm of the random forests was used to perform potential discriminating features analysis. The Monte-Carlo cross validation (MCCV) was applied to identify models with good performance. In each MCCV, two thirds (2/3) of the samples are used to evaluate the feature importance. The top 2, 3, 5, 10… important features are then used to build classification models which is validated on the 1/3 the samples that were left out. The procedure were repeated multiple times to calculate the performance and confidence interval (CI) of each model. Based on the cross validation, the multivariate models using 10 variables achieved an AUC of 0.835 ( Figure 8C). The top 10 significant discriminating features ranked based on their frequencies of being selected during cross validation are listed in Figure 8D.

Discussion
In the present study, a cohort of 105 ASD children were recruited and ranked based on their IFN-g levels derived from gdT cells. The top 25% and bottom 25% of the participants were selected which constituted the final two groups, respectively. Our results demonstrated that autistic behavioral symptoms of children in the IFN-g-high group were more severe, especially in the body and object use, social and self-help, and expressive language performance domains. The LEfSe analysis of gut microbiota revealed some bacterial taxa differentially abundant between groups. Decreased metabolism function of carbohydrate, amino acid and lipid in gut microbiota were found in the IFN-g-high group. Additional functional profiles analyses also revealed significant differences in the abundances of genes encoding carbohydrate-active enzymes between groups. And enriched A B C FIGURE 6 Differences of the abundances of genes related to pathogen-host interactions and gut-brain modules in the fecal microbiome between the IFN-g-  The Spearman correlation matrix among IFN-g, autistic behavioral symptoms, gut microbiota and functional modules which were significant in autistic children. Color intensity reflects Spearman correlation coefficient. *p<0.05, **p<0.01, ***p<0.001.
phenotypes related to infection and gastroenteritis and underrepresentation of one gut-brain module associated with histamine degradation were also found in the IFN-g-High group. Results of multivariate analyses revealed relatively good separation between the two groups and suggest that IFN-g could serve as a potential candidate biomarker to subtype ASD individuals into more homogeneous subtypes. Currently, the diagnosis of ASD is still made mainly based on behavioral symptoms (1). And the concept of spectrum suggests that individuals with ASD may present with diverse sets of symptoms that vary widely from one individual to another (1,55). The symptom diversity may be caused by many different factors, and this heterogeneity brings about great difficulty for researchers to elucidate the anticipated etiology or risk factors for ASD, because it would not be expected that a same etiological factor would explain two vastly different phenotypes (56, 57). It has now been well recognized that researchers should subtype these individuals within the spectrum to reduce the diversity and use a more homogeneous subtype to study the biological mechanism and explore effective treatment strategies (58). Previous subtyping strategies are mostly defined by some particular symptom characteristics, such as social behavior or language ability (6,7,9,10). Another feasible approach is using biomedical features to stratify samples to reduce heterogeneity and produce subgroups which are more likely to share a more similar phenotype and etiology (11,12,59). Compared to the behavioral symptom characteristics, using biomedical features as subtyping indicators has some advantages, because they are more objective and easily to measure, more directly to indicate the possible mechanisms underlying the associated heterogeneity, and could also provide useful information to further explore the potential targets to facilitate the development of individualized biomedical therapy strategies for certain ASD subtypes.
Immunological involvement in the pathophysiology of certain subtypes of ASD has long been hypothesized and accumulated results from both clinical and animal research have identified the associations between immunologic function abnormalities and ASD (12,(21)(22)(23)(24)(60)(61)(62). Moreover, clinical trials using immunemodulating or anti-inflammatory drugs in individuals with ASD also yield promising results, and the treatment responses were especially better for those with immunological or gastrointestinal disturbances (12,(63)(64)(65)(66). Results of these previous studies suggest that biological characteristics relating to immune function may serve as potential biomarkers to reduce the heterogeneity in ASD and to improve the prediction of response to certain biomedical treatments (12). In the present study, we choose IFN-g derived from gdT cells as subtyping biomarkers, because gdT cell intrinsically combines innate immunity and adaptive immunity and plays important roles in inflammatory and autoimmune diseases, which were found to be more prevalent in ASD individuals (21)(22)(23)(24)67). And results of previous studies also indicated IFN-g might play a role in the progression and exacerbation of autistic symptoms (28-31). Changes of INF-g levels have been found in blood samples and brain tissues of ASD subjects, and animal studies also confirmed upregulation of INF-g in animals with autistic-like behaviors (31,68,69). It has been demonstrated that plasma levels of INF-g correlated positively with plasma nitric oxide measures in ASD group and the higher NO production in ASD children may be secondary to IFN-g mediated up-regulation of the inducible nitric oxide synthase (iNOS) (70). INF-g may interact with gut microbiota and PBMCs taken from ASD subjects produced elevated levels of IFN-g against common dietary proteins (71). High levels of INF-g were also associated with a reduction in glucocorticoid receptor (72), which might result in excessive circulation of glucocorticoid, and the excessive glucocorticoid are well-known as neurotoxins (73). Although the direct solid evidence is still lacking, these findings support the hypothesis that INF-g may play a role in the pathologic mechanism of ASD.
However, results of the INF-g levels in ASD from the previous studies were not always consistent. Both higher and lower levels of INF-g have been found in blood samples and PBMCs of ASD [see the summarized results in the excellent systematic reviews (74, 75)]. In our clinical practice, we also find that a great heterogenicity exists in the INF-g levels in ASD. Levels of INF-g are very high in a portion of ASD children, and their behavioral symptoms seems to be different from other ASD children. So, we hypothesized that within the heterogeneous broad spectrum of ASD, those ASD children with high INF-g levels may represent a subgroup whose autistic symptoms and gut microbiota composition may be different from others. And the results of the present study turned out to support our initial hypothesis.
When comparing the autistic behavioral symptoms between the two subgroups selected based on levels of IFN-g derived from gdT cells, children with higher levels of IFN-g got significantly higher scores in ABC, especially for the body and object use subscale and the social and self-help subscale. Additionally, children in this group also got higher scores in the speech/language/communication subscale in ATEC. Since there were some discrepancies as assessed by ABC and ATEC questionnaires in the language domain, and the expressive and receptive language abilities were evaluated with different weights but calculated as a whole in these two questionnaires (44, 45), the CLSQ was used to further assess children's expressive and receptive language performance respectively (46). The results revealed that only the expressive language performance was significantly impaired in the IFN-g-High group. All these results suggest that autistic behavioral symptoms were different between the IFN-g-High and IFN-g-Low groups, and children with higher levels of IFN-g may suffer from more severe symptoms of ASD.
Since there exist intense interactions between gut microbiota and immune function, and alterations of gut-immune-brain axis has been suggested to act critical roles in the pathogenesis of ASD (32-35, 41, 42), differences in gut microbiota composition between the two groups were also analyzed. The most significant characteristic difference between the two subgroups is that Negativicutes, Selenomonadales and Veillonellaceae were more enriched in the IFN-g-High group, with the LDA score less than -4. Different abundances of these three bacterial taxa were also found between autistic and neurotypical subjects in several other independent studies (23,76,77). Indeed, the family Veillonellaceae belongs to the order Selenomonadales within the class Negativicutes, and they are all members of the phylum Firmicutes (78). Species of Firmicutes could upregulate IFN-g production and significant increased ratio of Firmicutes/Bacteroidetes has been reported associated with not only with ASD (77, 79), but also with other conditions that were found to be more prevalent within ASD subjects, such as obesity and diabetes (24,80,81).
The relative abundances of the species of Akkermansia muciniphila, Pyramidobacter piscolens, and Anaerotruncus colihominis were also found to be more enriched in the faces of ASD children in the IFN-g-High group. Akkermansia muciniphila is a mucin-degrading bacterium, which has been suggested to play a role in inflammation and gut permeability (82,83). Lower relative abundances of Akkermansia muciniphila has been found in feces of autistic children, which might reflect an indirect evidence of a thinner gastrointestinal mucus barrier in ASD children (83). Interestingly, there are also studies that found Akkermansia was present at higher relative abundances in feces of ASD individuals (76,84), or even at very high levels (up to 59%) in several autistic individuals (23). Results from these previous studies suggest that great diversity in the abundances of Akkermansia muciniphila may exist among different ASD individuals. In the present study, we found that within the spectrum of autism, there do exist significant differences in the relative abundances of Akkermansia muciniphila between ASD children in the IFN-g-High and the IFN-g-Low groups. Pyramidobacter piscolens is one of the members of the phylum Synergistetes. It was first isolated from human oral cavity (85) and is related to oral dysbiosis, which may result in periodontal diseases and abscess (86). Oral dysbiosis and these oral health conditions are also found to be more common in ASD children (87). Further studies revealed that Pyramidobacter piscolens could also be cultured from small intestine abscess, and it is now considered that Pyramidobacter piscolens is part of the commensal human microflora which plays a functional role but may also act as opportunistic pathogens (88). Additionally, Pyramidobacter piscolens is one of the core species which can regulate lipid deposition (89) and may influence blood glucose metabolism (90). Anaerotruncus colihominis belongs to phylum Firmicutes. It is a short-chain fatty acids (SCFA) producing species which is presumed to be anti-inflammatory and is related to autoimmunity (91)(92)(93)(94). The abundance of Anaerotruncus colihominis was found to be negatively associated with cognitive function scores in patients with Alzheimer's disease (95). Significant lower abundances of Anaerotruncus colihominis has also been found in patients with rheumatoid arthritis (RA) (94), and a number of clinical and basic studies have demonstrated roles of IFN-g in the pathogenesis of RA (96). It has also been reported that Anaerotruncus colihominis possesses the ability to produce acetic and butyric acids (91), which could have a role in regulating gut epithelial barrier function and play possible roles in ASD (97,98). Although both of the Pyramidobacter piscolens and Anaerotruncus colihominis play functional roles in the metabolism of bioactive compounds which is perturbed in ASD, studies of the direct roles of the two species in the pathophysiology of ASD is rare. Our results demonstrated that there were significant differences in the relative abundances of Pyramidobacter piscolens and Anaerotruncus colihominis between ASD children in the IFN-g-High and the IFNg-Low groups, and the biological significance of these findings warrants further research.
The results of the LEfSe analysis also revealed underrepresentation of Bacteroides xylanisolvens and Bifidobacterium longum in the IFN-g-High group. These two bacterial species are both non-pathogenic and process many probiotic qualities (99)(100)(101). Bacteroides xylanisolvens belongs to the second most abundant genus Bacteroides in the human intestine and they can break down many sugars including dietary fiber and xylan (102,103). It has been demonstrated that some strains of Bacteroides could modulate the function of innate immune system (104) and have the potential to relieve some behavioral and physiological abnormalities associated with ASD (105, 106). Bifidobacterium longum is considered to be one of the earliest colonizers of the gastrointestinal tract in infants (107). The domination of Bifidobacterium in infant's gastrointestinal tract could hinder pathogenic organisms' colonization through antimicrobial activity and competitive exclusion manners (108). Bifidobacterium longum could also serve as a scavenger because it metabolizes a large variety of substrates including bile salts, human milk oligosaccharide and some other complex oligosaccharides (107,109,110). The efficacy of Bifidobacterium longum in regulating immune (including its ability to suppress the expression of IFN-g in vivo) and central nervous system functions and alleviating psychiatric disorder-related behaviors including ASD and obsessive-compulsive disorder has also been demonstrated (100,101,111). It is worth mentioning that Bacteroides and Bifidobacterium species were also found to be depleted in ASD children in other independent cohort studies (76,83,112,113). Associations between gut microbiota and ASD certainly warrant further studies to elucidate a causation role in the pathogenesis of ASD. However, the consistency of these results across different ethnic groups using different sequencing and assay methods, together with their efficacy in alleviating autistic symptoms, strongly suggest that the loss of representation of these bacterial taxa is very robust and may be tightly associated with the pathophysiology of ASD.
For the predicted KEGG pathway analysis results, we found that the IFN-g-High group was less enriched in pathways related to amino acid metabolism, carbohydrate metabolism and lipid metabolism. As key partners involved in the maintenance of human physiology and health, gut microbes influence greatly on host metabolism and help balance important vital functions such as food digestion and nutrient bioavailability for the host (114). The relatively depleted pathway orthologues markers related to metabolism of amino acid, lipid and carbohydrate in the IFN-g-High group suggested that children in this subgroup may have higher risks of suffering from more sever metabolic dysfunction. Indeed, a great quantity of work has shown that children with ASD have perturbed metabolism as compared to neurotypical children (112,(115)(116)(117)(118)(119)(120)(121)(122)(123). For the amino acid metabolism, altered amino acid profile has been found in blood plasma (116,117), urine (118,119) and fecal (112) samples collected from ASD individuals. And it was postulated that gut microbial metabolism of phenylalanine and tyrosine may be involved in the pathogenesis of autism (120). Impaired carbohydrate digestion (121) and lipid metabolism (122) were also found in ASD individuals. The abundance of affected bacterial phylotypes in the intestines or duodenum of ASD individuals was found to be associated with expression levels of disaccharidases and transporters, which is important for carbohydrate digestion and transport (121, 123). Since Bacteroides spp. and Bifidobacterium spp. are specialized as primary and secondary degraders in the metabolism of complex carbohydrates (124), the depleted species of Bacteroides and Bifidobacterium in the IFN-g-High group may impact the carbohydrate metabolism capability. Additionally, as is demonstrated in this study, the abundances of genes encoding six families of carbohydrate-active enzymes in the fecal microbiome were significantly different between the IFN-g-High and IFN-g-Low groups, this may partly explain the possible mechanisms underlying the differences relating carbohydrate metabolism between the two groups. Furthermore, some bacterial species possess the ability to ferment dietary carbohydrates into the production of short chain fatty acids (SCFAs) (125). SCFAs can readily cross the gut-blood and blood-brain barriers and induce widespread effects on gut and brain via impact on epithelial barrier integrity, neurotransmitter synthesis and immune modulation (126)(127)(128). Since some of the metabolites such as Omega-6 (n-6) and Omega-3 (n-3) polyunsaturated fatty acids (PUFA) are essential nutrients for brain development and function, these metabolic alterations may be associated with the severity of autistic symptom (122,129,130). All these results further support the notion that ASD is a pervasive developmental disorder with multisystem dysfunction and metabolic disturbance.
Functional profiles analyses from the metagenomic data in our study also revealed that the abundances of genes related to infection (Pathogen gene: purT, Host species: Homo sapiens) and gastroenteritis (Pathogen gene: flhF, Host species: Homo sapiens) were significantly enriched in the IFN-g-High group. Gastrointestinal disorders are one of the most common medical conditions that are comorbid with ASD, and these comorbidities can cause greater severity in autistic symptoms (131). The results from our study further suggest that children in the IFN-g-High group may suffer from higher incidence or severity of infection and gastroenteritis, but these results need to be further validated with medical examination. Another significant difference is the underrepresentation of the gutbrain module (MGB010) associated with histamine degradation in the IFN-g-High group. Altered expression of histamine signaling genes has been found in ASD populations (132), and antagonism of histamine receptors could reduce autistic behavioral symptoms in ASD individuals and several relevant animal models (132)(133)(134)(135). Moreover, histamine receptor antagonists can suppress IFN-g production (136), while IFN-g can also modulate histamineinduced IL-6 and IL-8 production (137). Our data suggests the histamine degradation capability in fecal microbiota were much more impaired in children with higher levels of IFN-g, and the decreased capability of histamine degradation may partly affect the autistic behavioral symptoms in ASD children.
For the correlation analysis, the relative abundance of Bacteroides xylanisolvens showed most significant relationships with not only several autistic behavioral characteristics, but also with one of the carbohydrate-active enzymes families (GT28). Additionally, Bacteroides xylanisolvens was the top discriminating features in the multivariate models using the random forests algorithm, suggesting its importance in the separation of the two groups. Our results of the multivariate analysis indicate that although both of the two groups are within the spectrum, they can be separated using the IFN-g as indicator to obtain subtypes with more similar features. Based on the cross validation, the ROC curves built using 10 variables achieved an AUC of 0.835. In this study, the ROC curves were generated by MCCV using balanced sub-sampling. In each MCCV, two thirds (2/3) of the samples were used to evaluate the feature importance. The top 2, 3, 5, 10… important features were then used to build classification models which were validated on the 1/3 samples that were left out (54). Since more variables consistently leads to better prediction, and due to the relatively small sample size in this study, there exists a risk of overfitting. Therefore, it is important to evaluate the models with a large number of samples to estimate their generalizability with high confidence.
Since INF-g level varies widely within the heterogeneous broad spectrum of ASD, and as is shown in our study, the behavioral symptoms, gut microbiota composition and some metabolic features of ASD children in the INF-g-High group were different from those in the INF-g-Low group, utilization of this information to segregate ASD children into different subgroups will greatly facilitate the pathophysiology study of a more homogeneous clusters of ASD in the future. Additionally, although there still lack of solid evidence, several anti-inflammatory compounds (such as Palmitoylethanolamide, celecoxib, flavonoid luteolin) have been studied to investigate their effect as an adjunctive therapy in improving behavioral symptoms in autistic individuals (64, 138, 139). Since INF-g is an important proinflammatory cytokine involved in ASD but varies widely within the heterogeneous spectrum, we believe that using these anti-inflammatory drugs in ASD subgroup with high INF-g levels will yield more promising and consistent results.
As a preliminary study, there are several limitations ought to be mentioned. Firstly, only children with ASD were enrolled in this study, lacking typically developing children as controls, and the comparison of these obtained results with a control group can be informative. Secondly, the sample size in this study is relatively small, which may decrease the statistical power, and there exists a risk of overfitting for the MCCV model. Results of this study need to be validated in an independent larger cohort. Thirdly, additional risk factors (such as having a close relative with ASD, very low birth weight, and complications at delivery) were not collected from the participants in this study. The results should be interpreted with caution due to the observational nature of the present study. Also, consistent with the sex ratio of ASD, participants were mostly males, which limited the analyses of sex differences. Finally, only IFN-g was measured in this study without testing other cytokines such as interleukin and TNF, which limits the ability to explore the full picture of immunological profiles and characteristics in ASD children. Results of this study demonstrate only associations but not causations. Further studies are warranted to reveal the cause-effect relationships among IFN-g levels, gut microbiota composition and autistic behavioral symptoms.
Despite of these limitations and the preliminary nature of this study, our results suggest that levels of IFN-g derived from gdT cell could serve as one of the potential candidate biomarkers to subtype ASD individuals to reduce heterogeneity and produce subgroups which are more likely to share a more similar phenotype and etiology. Our results also further support the notion that there exits comprehensive and complex interaction among gut microbiota, immune function and autistic phenotypes. And a better understanding of the associations between immune function and gut microbiota composition as well as metabolism abnormalities in ASD would provide us deep insights into the pathogenesis of ASD and give us important clues to facilitate the development of systemic biomedical treatment for this complex neurodevelopmental disorder.

Data availability statement
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: https://www.ncbi.nlm.nih.gov/, PRJNA530620.

Ethics statement
The studies involving human participants were reviewed and approved by the Institutional Review Board of Peking Union Medical College Hospital. Written informed consent to participate in this study was provided by the participants' legal guardian/next of kin.
Author contributions X-JX, XY, X-FZ, and GT, conceptualization. X-JX, J-DL, and JY, methodology. J-DL and BL, validation. X-JX, J-DL, JY, and X-DL, formal analysis. X-JX, JY, and XY, investigation. X-JX and JY, writing -original draft preparation. X-JX, J-DL, and XY, writing-review and editing. X-JX and J-DL, visualization. XY and X-FZ, supervision. X-JX and XY, project administration and funding acquisition. All authors contributed to the article and approved the submitted version.